Additions to agent protocol #717

biglittlebigben · 2024-05-16T19:33:10Z

This PR:

Adds StartAgentJobRequest/StopAgentJobRequest calls to manually manage an agent's lifecycle
Deprecates the JobType field in WorkerInfo, as workers will now be expected to handle both kinds of jobs (if the namespace supports both kinds in the first place)
Adds a metadata string -> string map to Job to allow agent implementations to take parameters that are opaque to us
Adds JobStatus and error fields to Job for status reporting (similar to egress and ingress)
Makes livekit_agent.proto indentation consistent with other proto files
Adds room configuration and management APIs to the Room service. Room configurations allow bundling most room configuration parameters under a single id that can eventually be included in the room creation token.

changeset-bot · 2024-05-16T19:33:14Z

⚠️ No Changeset found

Latest commit: d82084e

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

💥 An error occurred when fetching the changed packages and changesets in this PR

Some errors occurred when validating the changesets config:
The package or glob expression "github.com/livekit/protocol" specified in the `fixed` option does not match any package in the project. You may have misspelled the package name or provided an invalid glob expression. Note that glob expressions must be defined according to https://www.npmjs.com/package/micromatch.

paulwe

lgtm 🚀

paulwe · 2024-05-17T18:42:06Z

protobufs/livekit_agent.proto

-    string participant_metadata = 6;
+  string participant_name = 4;
+  string participant_identity = 5;
+  string participant_metadata = 6;
 }

 message UpdateJobStatus {


this isn't implemented in the agents client... https://github.com/search?q=org%3Alivekit%20UpdateJobStatus&type=code

since every job will have a participant connection we could maybe remove this and use connection status?

if a participant is connected the job is ok

if the client disconnects uncleanly it was in error and we should reassign it

if the client disconnects cleanly the job is done and completed successfully

I think it may be OK, but we may want to implement Job metadata updates in the future (not a priority for now)

job metadata is immutable. the agent can use participant metadata or its own external storage

Stepping back: what information is conveyed in the participant* fields? Aren't the participants identified by the Job? If not, what participant is this referring to?

i was referring only to UpdateJobStatus - gh included irrelevant lines. to answer your question though - the agent service sends an availability request to the client. if the client is available to handle the job the response includes participant info used to create a token. the token is then passed back to the client in the job assignment

Ok. This PR deprecates the field for now. from the above, it looks like I should rather remove it entirely?

we should probably clean it up in lk first - the service implements it - but yeah, i think so

Sure. Next step is to implement this in lk anyway. Probably makes sense to do it before I merge this.

paulwe · 2024-05-17T18:43:38Z

protobufs/livekit_agent.proto

+message StartAgentJobRequest {
+  JobType type = 1;
+  Room room = 2;
+  optional ParticipantInfo participant = 3;


we probably only need publisher identity here

Same for just the room name mb?

theomonnom · 2024-05-17T19:31:59Z

protobufs/livekit_agent.proto

+  Room room = 2;
+  optional ParticipantInfo participant = 3;
+  string namespace = 4;
+  map<string, string> metadata = 5;


Can this just be a string? (let user serialize stuff if they want)

We currently use a map for egress, a string for other entities (participant, etc). Either has pros and cons. The map lets us potentially stash fields in there as well using a reserved namespace. The string is more flexible...

in cases where we might be tempted to extend the user supplied metadata with values from the service we should probably add structured fields. for example we might include an optional field for sip data with a struct containing phone #, trunk id, etc...
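A minimal sketch of the "structured fields instead of reserved metadata keys" idea. The `SipJobInfo` message, its field names, the package name, and all field numbers are hypothetical and only illustrate the shape; none of them come from this PR.

```protobuf
syntax = "proto3";

package example.agent;

// Hypothetical container for service-supplied SIP data.
message SipJobInfo {
  string phone_number = 1;
  string trunk_id = 2;
}

message Job {
  string id = 1;

  // Opaque, user-supplied string. The service stores and forwards it
  // without parsing it or adding keys of its own.
  string metadata = 2;

  // Service-supplied data goes into typed fields. Message fields track
  // presence, so this is simply left unset for non-SIP jobs.
  SipJobInfo sip = 3;
}
```

With this shape the user keeps full control over how `metadata` is serialized, and anything the service needs to attach later gets its own typed field.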

davidzhao · 2024-05-20T21:32:44Z

protobufs/livekit_models.proto

@@ -266,7 +266,8 @@ message SipDTMF {
 }

 message Transcription {
-  string participant_identity = 2;
+  // Participant that got its speech transcribed
+  string transcribed_participant_identity = 2;


seems redundant to me? everything in a Transcription should be scoped to the transcription.

That's a part of #706 probably. It was confusing to see two participant_identities in data packets in SDK.

I can see that.. would comments in the protobuf help to clear that up? Just looking at a Transcription object (without context on what is using the message), it seems to break some naming patterns to have transcribed_ again

Maybe source_ then?

davidzhao · 2024-05-20T21:34:07Z

protobufs/livekit_room.proto

@@ -59,11 +60,22 @@ service RoomService {

  // Update room metadata, will cause updates to be broadcasted to everyone in the room, Requires `roomAdmin`
  rpc UpdateRoomMetadata (UpdateRoomMetadataRequest) returns (Room);
+
+  // Create a room configuration. 
+  rpc CreateRoomConfiguration(CreateRoomConfigurationRequest) returns (RoomConfiguration);


worth discussing: should these be APIs on room service? or Project settings in Cloud (and config in OSS)?

if they are on room, we'd have to figure out permissions.. what token permissions would they require

A "configuration service" would work too. I have no strong opinion either way. Does anybody else have any input? Either way, we'll need to sort out permissions.

I would propose not to have a service for this initially. OSS would configure these in the config yaml. Cloud would store it per project.

How would cloud allow defining these configurations?

paulwe · 2024-05-21T09:51:48Z

protobufs/livekit_agent.proto

+  // The load that the job currently has on the worker
+  float load = 5;


as convenient as it would be to have this, does the agent client have any way to give us job load info? isn't most of the work done in a single process? @theomonnom ?

Egress runs each worker in a separate process and reports load for each ingress job separately, but that took a not insignificant amount of effort to implement and may be too high a burden to put on all agent implementations. I agree we should delete it unless @theomonnom has some further input on how this would be used?

paulwe · 2024-05-21T09:56:56Z

protobufs/livekit_agent.proto

+  JobStatus status = 7;
+  string error = 8;
+  float load = 9;
+  int64 started_at = 10;
+  int64 ended_at = 11;
+  int64 updated_at = 12; 


i'm not sure we want any of these on Job - this is the job description passed to the worker to initialize a job.

This mixes Job Model and Execution status in the same message indeed. This is the pattern that is also used by Egress. Most consumers of the Job object need its current or latest state as well, which makes it convenient to have a message that makes it easy to send them together.

The IngressInfo message attempts to separate the Model and State by packaging all the state-related entries into an IngressState message that is in turn included inside the Ingress message. The IngressState entry is null when the Ingress is used purely as a model (which only happens at initialization).

An alternative here would be to flip the Job and JobState relationship compared to ingress:

message JobState {
  JobStatus status = 1;
  string error = 2;
  float load = 3;
  int64 started_at = 4;
  int64 ended_at = 5;
  int64 updated_at = 6;
  Job job = 7;
}

this makes the most sense to me because the job config is an immutable property of the running process but for the sake of consistency we could compromise and copy ingress so they're at least separate objects
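For comparison, the ingress-style compromise mentioned above (state nested inside the job description, as a separate object) might look roughly like this. The enum value names, the package name, and the field numbers are assumptions made for the sketch, not the definitions in this PR.

```protobuf
syntax = "proto3";

package example.agent;

// Hypothetical status values, for illustration only.
enum JobStatus {
  JOB_STATUS_PENDING = 0;
  JOB_STATUS_RUNNING = 1;
  JOB_STATUS_SUCCESS = 2;
  JOB_STATUS_FAILED = 3;
}

// Mutable execution state, kept apart from the job description.
message JobState {
  JobStatus status = 1;
  string error = 2;
  int64 started_at = 3;
  int64 ended_at = 4;
  int64 updated_at = 5;
}

// Immutable description handed to the worker when the job starts.
message Job {
  string id = 1;
  string metadata = 2;

  // Unset when the Job is used purely as a model (at initialization),
  // mirroring how the state entry is described for ingress.
  JobState state = 3;
}
```

Either direction keeps the two concerns in separate messages; the difference is only which one is the outer object that consumers receive.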

paulwe · 2024-05-23T02:08:53Z

protobufs/livekit_agent.proto

+message StartAgentJobRequest {
+  JobType type = 1;
+  string room_name = 2;
+  optional string participant_identity = 3;
+  string namespace = 4;
+  string metadata = 5;
+}


in the current implementation publisher jobs apply to all participants once they begin publishing. this doesn't fit into this api. multiple job ids could result from a single call and more jobs could start after the initial call.

i think we've accidentally overloaded the Job concept - in the original design this represented an invocation of the agent worker. what we're creating with this api call is a rule for dispatching jobs ie start a job for the room or start a job for publishers or for one publisher with the identity x.

maybe just StartAgent / StartAgentRequest ?

I think we have 2 options:

Change the Job definition to represent 1 invocation for either the whole room, or 1 participant

Keep the current Job definition and remove the participant_identity argument in the Start message (and rename the rpc indeed).

Is there a use case for starting an Agent Job for a single participant? At first glance, it seems so in contexts where there is a main speaker. Short of doing that, we may eventually need some kind of filter argument to tell a 3rd party agent what participants to process.

The current (before this PR) Job definition already has an optional ParticipantInfo field. How are agents currently expected to use it?

we'll want the option to specify one or more publishers a job should start for. we don't want the user to have to specify the identities of every publisher in the room if they want the job to start automatically for publishers though. i think we need a new concept to describe what this api operates on - i don't have a great name for it but WorkOrder fits the job/worker metaphor.

the publisher's ParticipantInfo is stored in this field by the agents framework when dispatching a publisher type job.
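One possible shape for the dispatch-rule concept discussed in this thread, sketched under loose assumptions. `WorkOrder` is only the placeholder name floated above; the target messages, the "empty list means all publishers" convention, the package name, and every field name and number here are hypothetical.

```protobuf
syntax = "proto3";

package example.agent;

// Start a single job for the room as a whole.
message RoomTarget {}

// Start one job per matching publisher.
message PublisherTarget {
  // Assumed convention: an empty list means "every publisher in the
  // room", so callers never have to enumerate identities up front.
  repeated string identities = 1;
}

// A rule describing which jobs should be dispatched, as opposed to a
// single invocation of an agent worker.
message WorkOrder {
  string room_name = 1;
  string namespace = 2;
  string metadata = 3;

  oneof target {
    RoomTarget room = 4;
    PublisherTarget publishers = 5;
  }
}
```

Under this sketch a single `WorkOrder` can produce several jobs over time, which matches the observation that publisher jobs may start after the initial call.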

paulwe · 2024-06-26T17:04:01Z

can we add a field for the number of jobs active during the period the load sample is taken from to the UpdateWorkerStatus message.

what is the process for associating running jobs with a new worker connection when the websocket reconnects? should we expect a flood of MigrateJobRequest messages? maybe that message type should support migrating multiple jobs at once?

biglittlebigben · 2024-06-26T21:39:08Z

can we add a field for the number of jobs active during the period the load sample is taken from to the UpdateWorkerStatus message.

Added. We have to define what the active job count is though: the count at capture time, the max since last capture, ...?

what is the process for associating running jobs with a new worker connection when the websocket reconnects? should we expect a flood of MigrateJobRequest messages? maybe that message type should support migrating multiple jobs at once?

We have not implemented this anywhere yet AFAICT. I did change the message to use a list of ids though.
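A rough sketch of what the two changes discussed in this exchange might look like. The field names, field numbers, and package name are assumptions for illustration and may not match what was merged.

```protobuf
syntax = "proto3";

package example.agent;

message UpdateWorkerStatus {
  // Load sample reported by the worker.
  float load = 1;

  // Number of active jobs for the period the load sample covers.
  // Whether this is the count at capture time or the maximum since the
  // last capture is the open question raised above.
  int32 job_count = 2;
}

message MigrateJobRequest {
  // A list of ids lets a reconnecting worker re-associate all of its
  // running jobs in one message instead of sending one per job.
  repeated string job_ids = 1;
}
```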

WiP

dddd8d0

biglittlebigben added 2 commits May 16, 2024 15:25

Start/StopAgentJob, Job status

e7256b7

Updates

c415198

biglittlebigben requested review from paulwe, theomonnom, frostbyte73 and davidzhao May 17, 2024 18:09

paulwe approved these changes May 17, 2024

theomonnom reviewed May 17, 2024

feedback

e482780

biglittlebigben force-pushed the benjamin/agents branch from e6cb753 to e482780 Compare May 17, 2024 21:01

feedback

2901cd8

biglittlebigben force-pushed the benjamin/agents branch from 7a896ce to 2901cd8 Compare May 17, 2024 22:06

theomonnom approved these changes May 17, 2024

WiP

c9cef7e

biglittlebigben force-pushed the benjamin/agents branch from 4021d70 to c9cef7e Compare May 20, 2024 19:59

Remove AgentInfo and WorkerInfo

6d19b66

biglittlebigben force-pushed the benjamin/agents branch from 804d62e to 6d19b66 Compare May 20, 2024 20:32

Put load into Job

15bbb5c

biglittlebigben force-pushed the benjamin/agents branch from 3138bc2 to 15bbb5c Compare May 20, 2024 21:11

davidzhao reviewed May 20, 2024

Add job start/end/update timestamps

e5e2337

biglittlebigben force-pushed the benjamin/agents branch from 6945cd5 to e5e2337 Compare May 20, 2024 21:54

timestamps in job update

7908d00

biglittlebigben force-pushed the benjamin/agents branch from 5f74d1a to 7908d00 Compare May 20, 2024 22:44

Job status enum

5a410f7

biglittlebigben force-pushed the benjamin/agents branch from 22089fc to 5a410f7 Compare May 20, 2024 23:37

paulwe reviewed May 21, 2024

Merge remote-tracking branch 'origin/main' into benjamin/agents

13e271d

biglittlebigben force-pushed the benjamin/agents branch from f435a70 to 13e271d Compare May 21, 2024 18:27

feedback

101ca14

biglittlebigben force-pushed the benjamin/agents branch from f5a7b25 to 101ca14 Compare May 21, 2024 21:53

Require room in ListAgentJobs

b62ad8d

biglittlebigben force-pushed the benjamin/agents branch from 9cfdabe to b62ad8d Compare May 21, 2024 22:46

paulwe reviewed May 23, 2024

biglittlebigben force-pushed the benjamin/agents branch from f3112ae to b62ad8d Compare May 23, 2024 21:55

RoomDefinition

592e9e0

biglittlebigben force-pushed the benjamin/agents branch from 089fcd0 to 592e9e0 Compare May 24, 2024 19:08

Unmarshal RoomEgress

9903c97

biglittlebigben force-pushed the benjamin/agents branch from 90eb597 to 9903c97 Compare May 28, 2024 22:18

biglittlebigben added 2 commits May 28, 2024 16:09

RoomAgent

49bd88f

WiP

420a644

biglittlebigben force-pushed the benjamin/agents branch from a404b31 to 420a644 Compare June 3, 2024 21:37

biglittlebigben added 2 commits June 6, 2024 14:51

WiP

f913919

Merge remote-tracking branch 'origin/main' into benjamin/agents

edc26e4

biglittlebigben force-pushed the benjamin/agents branch from 773bbbe to edc26e4 Compare June 13, 2024 01:43

Merge remote-tracking branch 'origin/main' into benjamin/agents

2366dd8

biglittlebigben force-pushed the benjamin/agents branch from a0f422c to 2366dd8 Compare June 24, 2024 21:37

cleanup

d41f0fd

feedback

13fd91a

biglittlebigben force-pushed the benjamin/agents branch from c5934eb to 13fd91a Compare June 26, 2024 21:34

Merge remote-tracking branch 'origin/main' into benjamin/agents

e69197a

biglittlebigben force-pushed the benjamin/agents branch from 257638b to e69197a Compare June 27, 2024 17:26

generated protobuf

d82084e

biglittlebigben merged commit 82786f4 into main Jun 27, 2024
1 check passed

biglittlebigben deleted the benjamin/agents branch June 27, 2024 17:31
